Abstract

We outline here the mathematical expression of a diffusion model for cellphone malware transmitted
through Bluetooth channels. In particular, we provide the deterministic formula underlying the proposed
infection model, in its equivalent recursive (simple but computationally heavy) and closed-form (more
complex but efficiently computable) expressions.

Introduction 

The spreading of malware, i.e., malicious self-replicating code, has grown rapidly in the last few years,
becoming a substantial threat to wireless devices; mobile (smart)phones are nowadays
the most attractive present and future target. Papers studying the problem from both theoretical and
technical points of view have
technical points of view already appeared in literature since 2005 PUS], and nowadays a number of 
different approaches to modeling the virus diffusion are already available to the community. With the 
present work we want to contribute to this topic by proposing a more accurate model for the spread 
of a malware through the Bluetooth channel, providing both a recursive and a combinatorial equivalent 
deterministic formulation of the described solution. 

The model

The dynamics of the proposed model are as follows: at a certain time $\tau$, a number $I$ of infected mobiles
$b_1, \ldots, b_I$ come in contact with a number $S$ of clean (non-infected) cellphones $w_1, \ldots, w_S$; hereafter we
will denote this configuration as $(I, S)$.

All $S + I$ telephones are in the Bluetooth transmission range of each other and they all have their
Bluetooth device on. Each infected mobile tries to establish a connection with another device, clearly
not knowing whether it is trying to pair to a clean or to an infected phone. All these connections are
established instantaneously at time $\tau$. However, for the sake of simplicity we assume that the infected
mobiles establish connections following a given sequence, starting from $b_1$ down to $b_I$. In other words,
$b_1$ is the first to try to establish a connection and $b_I$ is the last one. Moreover, each connection is chosen
uniformly at random among all available choices. Connections between infected and clean mobiles
deterministically result in infection transmission: when a clean mobile gets paired to an infected one, it
becomes infected. All these events occur in the time interval $[\tau, \tau + \Delta\tau]$, where $\Delta\tau$ is the minimal time
allowing all infected mobiles to establish a connection and eventually transmit the virus: in practice,
it may be considered of the order of a few tens of seconds. We assume that in this time interval clean
cellphones do not try to establish any connections, e.g., for non-malware purposes. We also assume that
in this time interval no other mobile enters the Bluetooth transmission range of the $S + I$ mobiles and that,
when a connection between two mobiles is established, the two mobiles remain connected for the whole
time interval. Basically, we are assuming that the initial configuration $(I, S)$ is given and that it does not
change in the time interval $[\tau, \tau + \Delta\tau]$. Note that, given the definition of $\Delta\tau$, new infections do not result
in configuration changes in the time interval $[\tau, \tau + \Delta\tau]$.

All the aforementioned assumptions are reasonably realistic, due to the very short time scale considered.

The task here is to find the probability that, in this situation, a given clean mobile gets paired to
an infected one, and thus becomes infected itself.

Summarizing, the setup and the constraints of the model are the following: 

Setup $I$ infected mobiles $b_1, \ldots, b_I$ and $S$ clean mobiles $w_1, \ldots, w_S$ are in a room (i.e., in the Bluetooth
transmission range of each other).

Dynamics Starting from $b_1$ down to $b_I$, each infected mobile tries to connect with a yet unconnected
device, regardless of whether it is infected or not.

Constraint #1 Since the connection channel is Bluetooth, once a connection between two mobiles is
established, these two devices become unavailable for further connections, or, in other words, each
device can have at most one connection to another cellphone.

Constraint #2 For each $t = 1, \ldots, I$, when it is $b_t$'s turn to choose, $b_t$ must connect to one of the still
available devices, if any.

Let us consider the generic configuration $(I, S)$ with $I$ unpaired infected mobiles $b_1, \ldots, b_I$ and $S$
unpaired clean mobiles $w_1, \ldots, w_S$. According to the setup, the first mobile establishing a connection is
$b_1$. In Fig. 1 a possible evolution is displayed starting from an initial configuration with $I = 7$ infected
and $S = 5$ clean mobiles, together with an explanatory description of the occurring dynamics.

Due to the described dynamics, all the infected mobiles succeed in pairing, with the exception of at
most one, $b_z$, which can remain unpaired if there are no more available mobiles. This case can only happen
when there are more infected mobiles than clean ones, their sum is odd and all the clean mobiles get
paired:

$$I > S, \qquad (I + S) \bmod 2 = 1, \qquad I - 2j - 1 = S,$$

where $j$ is the number of pairings between two infected mobiles. Hence, the last choosing infected
mobile $b_z$ cannot find any available device to pair to. In what follows, we will refer to this case as the
† case; an example of this situation in the initial configuration $(7, 2)$ is shown in Fig. 2.

The model is completely described by computing the probability $P(I, S)$ that a certain clean mobile,
for instance $w_1$, gets infected in the time interval $[\tau, \tau + \Delta\tau]$.

Although $P(I, S)$ could be stochastically approximated by running repeated simulations, in the following
Sections we will derive two equivalent exact (deterministic) formulas for $P(I, S)$ in the aforementioned
setup. The former is a simple recursive expression, which follows straightforwardly from the model
dynamics, while the latter is its corresponding closed form (thus with no recursion involved), which has
a more complex expression and relies heavily on combinatorics. Besides their different mathematical
nature, the two formulae also behave differently from a computational point of view, as
discussed in a dedicated Section.
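The stochastic approximation mentioned above can be sketched in a few lines of Python. This is only an illustrative simulation written from the model description (sequential choice from $b_1$ to $b_I$, uniform choice among the still available devices); the function names, the number of trials and the seed are assumptions of this sketch, not part of the original work.

```python
import random


def simulate_once(I, S, rng):
    """Run one pairing round; return True if the clean mobile w_1 gets infected."""
    # Devices are labelled ("b", t) for infected and ("w", t) for clean mobiles.
    available = [("b", t) for t in range(1, I + 1)]
    available += [("w", t) for t in range(1, S + 1)]
    for t in range(1, I + 1):            # b_1 chooses first, b_I last
        chooser = ("b", t)
        if chooser not in available:     # already paired by an earlier infected mobile
            continue
        available.remove(chooser)
        if not available:                # dagger case: nobody is left to pair with
            break
        partner = rng.choice(available)  # uniform choice among the available devices
        available.remove(partner)
        if partner == ("w", 1):
            return True
    return False


def estimate_probability(I, S, trials=20000, seed=0):
    """Monte Carlo estimate of P(I, S) (hypothetical helper)."""
    rng = random.Random(seed)
    hits = sum(simulate_once(I, S, rng) for _ in range(trials))
    return hits / trials
```

For small configurations the estimate can be compared with values worked out by hand, e.g. $P(2, 2) = 2/3$, since $w_1$ escapes infection only when $b_1$ pairs with $b_2$.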




Figure 1. An example of model dynamics starting from the initial configuration (7,5). In
red, the pairing that is established at each step. (a) At time $\tau$, $I = 7$ infected mobile phones
$b_1, \ldots, b_7$ and $S = 5$ clean mobiles $w_1, \ldots, w_5$ are all within their mutual Bluetooth connection range.
(b) $b_1$ chooses a mobile among $b_2, \ldots, b_7, w_1, \ldots, w_5$; it chooses $w_1$, establishing connection 1. (c) Now it is
$b_2$'s turn to choose, and $b_1$ and $w_1$ are not available anymore for pairing (marked by a grey circle).
(d) $b_2$ connects to $b_3$ through pairing 2. (e) The two mobiles $b_2$ and $b_3$ become unavailable for pairing,
too, and the next infected mobile in line, $b_4$, pairs to $w_2$ via pairing 3. (f) Only $b_6$, $b_7$ and $w_3, w_4, w_5$ remain
available for pairing with $b_5$, which chooses $b_7$ (connection 4). (g) Now the last mobile $b_6$ must connect
to one of the remaining unpaired clean phones $w_3, w_4, w_5$: it chooses $w_4$, creating pairing 5. (h) There are no
more unpaired infected mobiles: the process ends at time $\tau + \Delta\tau$.

Figure 2. An example of the † situation. Starting from the initial configuration (7,2), $b_1$ infects
the clean mobile $w_1$, $b_2$ pairs to $b_3$, $b_4$ infects $w_2$ and, finally, $b_5$ pairs to $b_7$. Here the process ends,
because there are no more mobiles available for pairing to $b_6$, which remains unconnected.

The recursive formula 



Recursively, the probability $P(I, S)$ that a given susceptible mobile $w_t$ gets infected starting from a given
initial configuration $(I, S)$ can be written as follows:

$$
\begin{aligned}
P(I,S) &= \frac{1}{I+S-1} + \frac{S-1}{I+S-1}\,P(I-1,S-1) + \frac{I-1}{I+S-1}\,P(I-2,S), \\
P(0,S) &= 0, \\
P(I,0) &= 0, \\
P(1,S) &= \frac{1}{S},
\end{aligned}
\tag{1}
$$

where the trivial conditions $P(0,S) = 0$, $P(I,0) = 0$ and $P(1,S) = 1/S$ initialize the recursion, thus
covering all possible cases.

Since all clean mobiles share the same probability $P(I, S)$ of getting infected, without loss of generality
we may assume $w_t = w_1$. The three terms $\frac{1}{I+S-1}$, $\frac{S-1}{I+S-1} P(I-1, S-1)$ and $\frac{I-1}{I+S-1} P(I-2, S)$ contributing
to the general case of $P(I, S)$ come from the three mutually exclusive cases which can occur starting from
the initial configuration $(I, S)$:

1. $b_1$ establishes a pairing with $w_1$. In this case $w_1$ gets infected and this event occurs with probability
$\frac{1}{I+S-1}$.

2. $b_1$ establishes a pairing with one of the other $S - 1$ clean mobiles $w_2, \ldots, w_S$. This event occurs
with probability $(S - 1) \cdot \frac{1}{I+S-1}$ and of course $w_1$ does not get infected by $b_1$. However, $w_1$ may be
infected later by the remaining $I - 1$ available infected phones (with only $S - 1$ clean mobiles still
available, because one clean mobile has been infected by $b_1$), thus falling back to an $(I - 1, S - 1)$
configuration.

3. $b_1$ establishes a pairing with one of the other $I - 1$ unpaired infected mobiles $b_2, \ldots, b_I$. This
event occurs with probability $(I - 1) \cdot \frac{1}{I+S-1}$ and of course $w_1$ does not get infected by $b_1$. However,
similarly to the previous situation, $w_1$ may be infected later by the remaining $I - 2$ unpaired infected
phones, thus falling back to an $(I - 2, S)$ configuration.

A worked-out example illustrating the construction of Eq. (1) is shown in Fig. 3.

The formula in Eq. (1) for $P(I, S)$ relies on a recursive equation of second order with non-constant
coefficients, for which no general method is known to derive the corresponding non-recursive (closed)
expression. Moreover, as detailed in a later Section, calculating $P(I, S)$ by using Eq. (1) is computationally
heavy. However, we will obtain the equivalent time-saving closed-form solution in the next Section using
combinatorial arguments.
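As an illustration, Eq. (1) can be transcribed almost literally into Python. The sketch below uses exact rational arithmetic and adds memoization through `functools.lru_cache`; the memoization and the function name are assumptions of this sketch and are not taken from the original implementation, whose runtimes are discussed in a later Section.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def infection_probability(I, S):
    """P(I, S) from Eq. (1), returned as an exact fraction."""
    if I <= 0 or S <= 0:       # P(0, S) = 0 and P(I, 0) = 0
        return Fraction(0)
    if I == 1:                 # P(1, S) = 1/S
        return Fraction(1, S)
    d = I + S - 1              # devices that b_1 can choose from
    return (Fraction(1, d)
            + Fraction(S - 1, d) * infection_probability(I - 1, S - 1)
            + Fraction(I - 1, d) * infection_probability(I - 2, S))
```

For instance, `infection_probability(3, 2)` expands to $1/4 + (1/4)\,P(2,1) + (2/4)\,P(1,2) = 1/4 + 1/8 + 1/4 = 5/8$.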



The combinatorial formula 

To construct the explicit formula equivalent to Eq. (1) we need to employ a few combinatorial
considerations. The key observation is that we can count all wirings (lists of pairings) that can occur at the end
of the pairing process. Clearly, the fact that there is an order in setting up the connections between the
mobiles heavily influences the probability that a given wiring can occur: in particular, this probability
depends on the number $j$ of pairings between infected mobiles (bb-pairings, for short). As background
material, we recall some definitions and results from combinatorics in the box in Fig. 4, together with
the two following functions:

the Heaviside step function

$$
H(x) = \begin{cases} 1 & \text{for } x \geq 0 \\ 0 & \text{for } x < 0 \, ; \end{cases}
$$

the Kronecker delta function

$$
\delta(x) = \begin{cases} 1 & \text{for } x = 0 \\ 0 & \text{for } x \neq 0 \, . \end{cases}
$$

As an example, the following indicator function can be written in the two equivalent formulations:

$$
\dagger(I,S,j) = \begin{cases} 1 & \text{in the † case} \\ 0 & \text{otherwise} \end{cases}
= H(I-S-1)\,\delta((I+S+1) \bmod 2)\,\delta(2j - I + S + 1) ,
$$

where mod is the Euclidean remainder function, so $x \bmod 2$ is zero for even $x$ and one for odd $x$.
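The indicator above translates directly into code. The following Python sketch mirrors the three factors of the product; the function names are illustrative choices, and the step function is taken to be 1 at zero, so that $H(I - S - 1) = 1$ exactly when $I > S$.

```python
def heaviside(x):
    """Heaviside step function: 1 for x >= 0, 0 otherwise."""
    return 1 if x >= 0 else 0


def kronecker(x):
    """Kronecker delta function: 1 for x == 0, 0 otherwise."""
    return 1 if x == 0 else 0


def dagger(I, S, j):
    """1 if configuration (I, S) with j bb-pairings is the dagger case, else 0."""
    return (heaviside(I - S - 1)               # more infected than clean mobiles
            * kronecker((I + S + 1) % 2)       # I + S is odd
            * kronecker(2 * j - I + S + 1))    # all clean mobiles get paired
```

With the configuration of Fig. 2, `dagger(7, 2, 2)` evaluates to 1, while `dagger(7, 5, 2)` (the configuration of Fig. 1) evaluates to 0 because $I + S$ is even.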
Suppose now we are starting from an initial configuration $(I, S)$; then define the following quantities:

• $L(I, S)$: the minimum number of bb-pairings in a wiring;

• $P(I, S, j)$: the probability that a wiring with exactly $j$ bb-pairings occurs;

• $N(I, S, j)$: the number of all possible ways to select $j$ bb-pairings;

• $N_w(I, S, j)$: the number of all possible wirings with a given list of $j$ bb-pairings when a (generic)
clean mobile gets paired;

Figure 3. Construction of the general case of the recursive formula Eq. (1). Starting from the
initial configuration $(I, S)$, we want to compute the probability $P(I, S)$ that a clean mobile ($w_1$ without
loss of generality) gets infected in the proposed model. At time $\tau$, the first infected mobile $b_1$ tries to
establish a pairing, and only one of the three following alternatives can occur. In green, the case when
$b_1$ immediately infects $w_1$ (with probability $\frac{1}{I+S-1}$) and we are done. In blue, the case when $b_1$ pairs to
one of the remaining $I - 1$ infected mobiles $b_t$ with probability $\frac{I-1}{I+S-1}$; then $b_1$ and $b_t$ become
unavailable for pairing with the following choosing mobile $b_2$, and we are moved into the case of
computing the probability that $w_1$ gets infected when there are $I - 2$ unlinked infected mobiles and $S$
clean ones, i.e., $P(I - 2, S)$. Finally, in orange, the case when $b_1$ pairs to one of the other $S - 1$ clean
mobiles $w_t$ (with $w_t \neq w_1$) with probability $\frac{S-1}{I+S-1}$; then $b_1$ and $w_t$ become unavailable for pairing with
the following choosing mobile $b_2$, and we are moved into the case of computing the probability that $w_1$
gets infected when there are $I - 1$ unlinked infected mobiles and $S - 1$ unlinked clean ones, i.e.,
$P(I - 1, S - 1)$. The general case $P(I, S) = \frac{1}{I+S-1} + \frac{S-1}{I+S-1} P(I-1, S-1) + \frac{I-1}{I+S-1} P(I-2, S)$ is
obtained by summing the contributions of all three alternative cases described above.



Definitions

• The combinations $C(M,T)$ are the different selections of $T$ elements from an original universe $U$ of $M$
objects regardless of the ordering.

• The dispositions (or combinations with ordering) $D(M,T)$ are the selections of $T$ elements from an
original universe of $M$ objects where different orderings correspond to different dispositions.

• The permutations $P(T)$ are the different orderings (anagrams) of $T$ elements.

[Graphical examples of the combinations $C(3,2)$, the dispositions $D(3,2)$ and the permutations $P(3)$.]

Cardinalities

• $|D(M,T)| = \frac{M!}{(M-T)!}$

• $|P(T)| = T!$

• $|C(M,T)| = \binom{M}{T} = |D(M,T)| / |P(T)|$

• Among all combinations of $M$ objects in groups of $T$ elements, a particular element is selected exactly
$|C(M-1, T-1)|$ times.

Figure 4. Basic definitions, examples and facts on dispositions, combinations and
permutations.

• $N(I, S, j, w_t)$: the number of all possible wirings with a given list of $j$ bb-pairings and where the
clean mobile $w_t$ is paired;

• $W_w(I, S, j) = N(I, S, j) \cdot N_w(I, S, j)$: the number of all possible wirings with $j$ bb-pairings when a
(generic) clean mobile gets paired;

• $W(I, S, j, w_t) = N(I, S, j) \cdot N(I, S, j, w_t)$: the number of all possible wirings with $j$ bb-pairings
where the clean mobile $w_t$ is paired;

• $N^{\dagger}(I, S, h)$: in the † case, with $j \geq 2$, the number of possible wirings with $b_h$ unpaired, for $h \leq I$.
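The cardinalities recalled in Fig. 4 can be checked by brute-force enumeration. The sketch below does so with Python's `itertools`; the helper name and the choice of element 0 as the "particular element" are assumptions made only for illustration.

```python
from itertools import combinations, permutations


def enumerate_cardinalities(M, T):
    """Count combinations, dispositions and permutations by explicit enumeration."""
    universe = range(M)
    combos = list(combinations(universe, T))        # C(M, T): order is irrelevant
    dispositions = list(permutations(universe, T))  # D(M, T): order matters
    orderings = list(permutations(range(T)))        # P(T): orderings of T elements
    # How many combinations contain one particular element (here element 0).
    containing_zero = sum(1 for c in combos if 0 in c)
    return len(combos), len(dispositions), len(orderings), containing_zero
```

For $M = 3$ and $T = 2$, the case shown in the figure, the function returns `(3, 6, 2, 2)`, in agreement with $|C(3,2)| = |D(3,2)| / |P(2)|$ and $|C(2,1)| = 2$.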

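In the above notation, the (non-recursive) closed-form expression equivalent to Eq. (1) for the
probability $P(I, S)$ that a given susceptible mobile $w_t$ gets infected in a given initial configuration $(I, S)$ can
be written as follows:

$$
\begin{aligned}
P(I,S) = \sum_{j=L(I,S)}^{\lfloor I/2 \rfloor}
&\left[ H\!\left(\left\lfloor \tfrac{I}{2} \right\rfloor - j\right) H\big(j - L(I,S)\big) \prod_{k=0}^{I-j-1-\dagger} \frac{1}{I+S-1-2k} \right] \\
\cdot &\left[ (1-\dagger)\,\frac{1}{j!} \prod_{k=1}^{j} \binom{I-2(k-1)}{2}
 + \dagger \left( \delta(j) + \delta(j-1)\left(\binom{I}{2} - 1\right)
 + H(j-2) \sum_{h=S+1}^{I} \left[\!\left[ \delta(I-h)\,\frac{1}{j!} \prod_{k=0}^{j-1} \binom{I-1-2k}{2} \right.\right.\right.\right. \\
&\qquad + \delta(I-1-h)\,\frac{I-2}{(j-1)!} \prod_{k=0}^{j-2} \binom{I-3-2k}{2}
 + H(h-S-1)\,H(I-2-h) \left( \delta(j-I+h) \prod_{k=1}^{I-h} (h-k) \right. \\
&\qquad \left.\left.\left.\left.\left. + H(j-I+h-1)\,\frac{1}{(j-I+h)!} \prod_{k=1}^{I-h} (h-k) \prod_{d=1}^{j-I+h} \binom{2h-I-1-2(d-1)}{2} \right) \right]\!\right] \right) \right] \\
\cdot &\left[ (I-2j)!\,\binom{S-1}{I-2j-1}\,(1-\dagger) + S!\;\dagger \right],
\end{aligned}
\tag{2}
$$

where, for brevity, $L(I,S) = H(I-S-1)\,\frac{I-S-(I+S) \bmod 2}{2}$ and
$\dagger = \dagger(I,S,j) = H(I-S-1)\,\delta((I+S+1) \bmod 2)\,\delta(2j-I+S+1)$ is the indicator function of the † case.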

Eq. (2) has its roots in the following counting argument: the probability that a given clean mobile $w_t$
gets infected is the sum over all admissible values of $j$ of all possible wirings with $j$ bb-pairings weighted
by the probability that a wiring with exactly $j$ bb-pairings occurs:

$$
\begin{aligned}
P(I,S) &= \sum_{j=L(I,S)}^{\lfloor I/2 \rfloor} P(I,S,j) \cdot W(I,S,j,w_t) \\
&= \sum_{j=L(I,S)}^{\lfloor I/2 \rfloor} P(I,S,j) \cdot N(I,S,j) \cdot N(I,S,j,w_t) ,
\end{aligned}
\tag{3}
$$

where $L(I, S)$ is the minimum number of bb-pairings that can be established in an initial configuration
$(I, S)$.

The rationale of summing over the number of bb-pairings to compute $P(I, S)$ relies on the observation
that the probability of $w_t$ getting infected depends on the number of available infected mobiles that
will pair with clean mobiles, that is exactly the number of infected mobiles which are not already paired
to another infected mobile, i.e., that are not involved in a bb-pairing.

In particular, the three terms between brackets in Eq. (2) match respectively the three factors in Eq. (3),
while the term between double brackets ([[ , ]] to enhance readability) corresponds to $N^{\dagger}(I, S, h)$.

In what follows we will show that the expansion of the right-hand member of Eq. (3) coincides with Eq. (2).
The expansions of all terms will be carried out first by separately considering all occurring cases, and
then providing a unique closed-form formula (without conditional expressions) by using the Heaviside
step and the Kronecker delta functions.

Lemma 1. Given an initial configuration $(I, S)$, the minimum number $L(I, S)$ of bb-pairings in a wiring
is the following:

$$
L(I,S) = \begin{cases}
0 & \text{for } I \leq S \\
\frac{I-S}{2} & \text{for } I > S,\ I - S \in 2\mathbb{Z} \\
\frac{I-S-1}{2} & \text{for } I > S,\ I - S \in 2\mathbb{Z} + 1
\end{cases}
\; = \; H(I-S-1)\,\frac{I - S - (I+S) \bmod 2}{2} ,
$$

while the maximum number is $\lfloor I/2 \rfloor$.

In fact, while when $I \leq S$ it is possible not to have any bb-pairing, when $I > S$ they cannot be fewer
than $\frac{I-S}{2}$ or $\frac{I-S-1}{2}$, respectively when $I - S$ is even or odd. This is due to Constraint #2, which imposes
that an infected mobile $b_t$ must connect to another device whenever one is available, when it is its turn to
choose. □
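A minimal Python sketch of Lemma 1 follows, giving both the case-by-case definition and the single-expression form with the Heaviside function; the function names are illustrative assumptions.

```python
def min_bb_pairings_cases(I, S):
    """L(I, S) written case by case, as in the statement of Lemma 1."""
    if I <= S:
        return 0
    if (I - S) % 2 == 0:
        return (I - S) // 2
    return (I - S - 1) // 2


def min_bb_pairings(I, S):
    """L(I, S) as a single expression: H(I - S - 1) * (I - S - (I + S) mod 2) / 2."""
    step = 1 if I - S - 1 >= 0 else 0
    return step * (I - S - (I + S) % 2) // 2


def max_bb_pairings(I):
    """Maximum number of bb-pairings: floor(I / 2)."""
    return I // 2
```

For the configuration $(7, 2)$ of Fig. 2 both forms give $L = 2$, which is exactly the number of bb-pairings shown there.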

Lemma 2. Given an $(I, S)$ configuration, the probability $P(I, S, j)$ that a wiring with exactly $j \geq 0$
bb-pairings between two infected mobiles occurs is the following:

$$
P(I,S,j) = \begin{cases}
0 & \text{if } j > \lfloor I/2 \rfloor \text{ or} \\
  & \text{if } j < \frac{I-S}{2} \text{ when } I > S \text{ and } I + S \in 2\mathbb{Z} \text{ or} \\
  & \text{if } j < \frac{I-S-1}{2} \text{ when } I > S \text{ and } I + S \in 2\mathbb{Z} + 1 \\[4pt]
\displaystyle\prod_{k=0}^{I-j-1-z} \frac{1}{I+S-1-2k} & \text{otherwise,}
\end{cases}
$$

with $z = 1$ in the † case and $z = 0$ elsewhere, or equivalently

$$
P(I,S,j) = H\!\left(\left\lfloor \tfrac{I}{2} \right\rfloor - j\right)
H\!\left(j - H(I-S-1)\,\frac{I-S-(I+S) \bmod 2}{2}\right)
\prod_{k=0}^{I-j-1-H(I-S-1)\,\delta((I+S+1) \bmod 2)\,\delta(2j-I+S+1)} \frac{1}{I+S-1-2k} .
$$

In fact, when there are $j$ bb-pairings in the admissible range, all possible wirings depend on the choice
of $j$ infected devices $b$ and $I - 2j$ clean devices $w$, i.e., $I - j$ elements from the original set of $I + S$. The
first element has probability $\frac{1}{I+S-1}$ to be chosen, the second $\frac{1}{I+S-3}$, the third $\frac{1}{I+S-5}$ and so on. □

Lemma 3. Given an initial configuration $(I, S)$ in the † case with $j \geq 2$, the number $N^{\dagger}(I, S, h)$ of
possible wirings with $b_h$ unpaired, for $h \leq I$, is:

$$
N^{\dagger}(I,S,h) = \begin{cases}
\dfrac{1}{|P(j)|} \displaystyle\prod_{k=0}^{j-1} |C(I-1-2k, 2)| & \text{for } h = I \\[8pt]
\dfrac{1}{|P(j-1)|}\,(I-2) \displaystyle\prod_{k=0}^{j-2} |C(I-3-2k, 2)| & \text{for } h = I - 1 \\[8pt]
0 & \text{for } S+1 \leq h \leq I-2 \text{ and } j < I - h \\[4pt]
\displaystyle\prod_{k=1}^{I-h} (h-k) & \text{for } S+1 \leq h \leq I-2 \text{ and } j = I - h \\[8pt]
\dfrac{1}{|P(j-I+h)|} \displaystyle\prod_{k=1}^{I-h} (h-k) \prod_{d=1}^{j-I+h} |C(2h-I-1-2(d-1), 2)| & \text{for } S+1 \leq h \leq I-2 \text{ and } j > I - h \\[8pt]
0 & \text{for } h \leq S
\end{cases}
$$

$$
\begin{aligned}
= {} & \delta(I-h)\,\frac{1}{j!} \prod_{k=0}^{j-1} \binom{I-1-2k}{2}
 + \delta(I-1-h)\,\frac{I-2}{(j-1)!} \prod_{k=0}^{j-2} \binom{I-3-2k}{2} \\
& + H(h-S-1)\,H(I-2-h) \left[ \delta(j-I+h) \prod_{k=1}^{I-h} (h-k)
 + H(j-I+h-1)\,\frac{1}{(j-I+h)!} \prod_{k=1}^{I-h} (h-k) \prod_{d=1}^{j-I+h} \binom{2h-I-1-2(d-1)}{2} \right] .
\end{aligned}
$$

The idea is that all the $I - h$ infected mobiles $b_{h+1}, \ldots, b_I$ must be part of a bb-pairing, so they
must be connected to one of the $b_1, \ldots, b_{h-1}$. Once they have been chosen, the remaining $j - (I - h)$
bb-pairings must be selected among the mobiles $b_1, \ldots, b_{h-1}$ that are yet unpaired. Both considerations
can be expressed in terms of combinations using the definitions and the properties of Fig. 4. □

Lemma 4. In the $(I, S)$ configuration, the number of all possible ways to select $j$ bb-pairings is:

$$
N(I,S,j) = \begin{cases}
\dfrac{1}{|P(j)|} \displaystyle\prod_{k=1}^{j} |C(I-2(k-1), 2)| & \text{apart from the † case} \\[8pt]
1 & \text{in the † case with } j = 0 \\[4pt]
|C(I,2)| - 1 & \text{in the † case with } j = 1 \\[4pt]
\displaystyle\sum_{h=S+1}^{I} N^{\dagger}(I,S,h) & \text{in the † case with } j \geq 2
\end{cases}
$$

$$
\begin{aligned}
= {} & \big(1 - H(I-S-1)\,\delta((I+S+1) \bmod 2)\,\delta(2j-I+S+1)\big)\,\frac{1}{j!} \prod_{k=1}^{j} \binom{I-2(k-1)}{2} \\
& + H(I-S-1)\,\delta((I+S+1) \bmod 2)\,\delta(2j-I+S+1) \left( \delta(j) + \delta(j-1)\left(\binom{I}{2} - 1\right)
 + H(j-2) \sum_{h=S+1}^{I} N^{\dagger}(I,S,h) \right) .
\end{aligned}
$$

Apart from the † case, selecting $j$ bb-pairings is equivalent to consecutively choosing $j$ unordered pairs
$b_r|b_s$ from the original set of $I$ infected mobiles. The first pair can be chosen in $|C(I, 2)|$ ways, the second
pair in $|C(I - 2, 2)|$ and so on. The division by $|P(j)|$ is motivated by the fact that the particular ordering
in which the $j$ pairs are chosen is irrelevant: the list $b_1|b_2, b_3|b_4, b_5|b_6$ is indistinguishable from the list
$b_5|b_6, b_1|b_2, b_3|b_4$. The number of these different orderings is precisely $|P(j)|$ by definition of permutations.
In the † case, if $j = 0$ there is only one way to choose bb-pairings, while if $j = 1$ the unpaired infected
mobile can only be $b_I$, so from $|C(I, 2)|$ we have to subtract the case where the only bb-pairing involves
$b_I$, which is impossible. Finally, in the † case with $j \geq 2$ the unpaired infected mobile can be any $b_h$
with $S + 1 \leq h \leq I$, and the total number of cases (which coincides with the number of cases where
$w_t$ is selected, since all the clean mobiles are connected in these situations) is the sum of all cases with
$h = S + 1, \ldots, I$. □

Lemma 5. In the $(I, S)$ configuration, with $j$ bb-pairings, the number of all possible cases when a
particular $w_t$ is chosen is:

$$
N(I,S,j,w_t) = \begin{cases}
|P(S)| & \text{in the † case} \\
|P(I-2j)| \cdot |C(S-1, I-2j-1)| & \text{otherwise}
\end{cases}
$$

$$
\begin{aligned}
= {} & (I-2j)!\,\binom{S-1}{I-2j-1}\,\big(1 - H(I-S-1)\,\delta((I+S+1) \bmod 2)\,\delta(2j-I+S+1)\big) \\
& + S!\;H(I-S-1)\,\delta((I+S+1) \bmod 2)\,\delta(2j-I+S+1) .
\end{aligned}
$$

The result follows immediately from the cardinality equations in Fig. 4, in particular from the fact
that among all combinations of $M$ objects in groups of $T$ elements, a particular element is selected exactly
$|C(M - 1, T - 1)|$ times. When $I$ is even and $j = I/2$ we follow the convention $\binom{A}{B} = 0$ for $A \geq 0$, $B < 0$.
In the † case, since all the non-infected mobiles are selected, the possible ways to select them are exactly
their permutations. □

This completes the expansion of Eq. (3) into Eq. (2).
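To make the counting argument concrete, the sketch below evaluates the sum of Eq. (3) using Lemmas 2, 4 and 5, and compares it with the recursion of Eq. (1). It is deliberately restricted to configurations where the † case cannot occur ($I \leq S$, or $I + S$ even) and to $S \geq 1$, so the $N^{\dagger}$ term of Lemma 3 is not implemented; the function names and this restriction are assumptions of the sketch.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def recursive_p(I, S):
    """P(I, S) from the recursion of Eq. (1)."""
    if I <= 0 or S <= 0:
        return Fraction(0)
    if I == 1:
        return Fraction(1, S)
    d = I + S - 1
    return (Fraction(1, d) + Fraction(S - 1, d) * recursive_p(I - 1, S - 1)
            + Fraction(I - 1, d) * recursive_p(I - 2, S))


def closed_form_p(I, S):
    """Sum of Eq. (3) for configurations without the dagger case (S >= 1)."""
    if I > S and (I + S) % 2 == 1:
        raise ValueError("dagger case possible: not covered by this sketch")
    lower = 0 if I <= S else (I - S) // 2          # L(I, S), Lemma 1
    total = Fraction(0)
    for j in range(lower, I // 2 + 1):
        prob = Fraction(1)                         # P(I, S, j), Lemma 2 with z = 0
        for k in range(I - j):
            prob /= I + S - 1 - 2 * k
        ways = 1                                   # N(I, S, j), Lemma 4
        for k in range(1, j + 1):
            ways *= comb(I - 2 * (k - 1), 2)
        ways //= factorial(j)
        r = I - 2 * j - 1                          # N(I, S, j, w_t), Lemma 5
        with_wt = factorial(I - 2 * j) * comb(S - 1, r) if r >= 0 else 0
        total += prob * ways * with_wt
    return total
```

For example, in the configuration $(4, 2)$ only $j = 1$ contributes: $\frac{1}{15} \cdot 6 \cdot 2 = \frac{4}{5}$, the same value given by the recursion.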

Equivalence between the recursive and the closed formula can be proven by showing that Eq. (2)
satisfies the recursive relations of Eq. (1). The analytical proof of the equivalence involves working out a
large number of cumbersome identities of binomial coefficients and factorials: in the last Section, we will
briefly outline a sketch of the proof in the simple case $I = S \in 2\mathbb{Z}$. Numerically, the differences between
the two formulae are below machine precision for $1 \leq I, S \leq 50$.

We conclude the Section with the observation that the sum of the total number of cases weighted by
their corresponding probabilities adds up correctly to one:

$$
\sum_{j=L(I,S)}^{\lfloor I/2 \rfloor} P(I,S,j) \cdot W_w(I,S,j)
= \sum_{j=L(I,S)}^{\lfloor I/2 \rfloor} P(I,S,j) \cdot N(I,S,j) \cdot N_w(I,S,j) = 1 ,
$$

because of the following counting lemma.

Lemma 6. In the $(I, S)$ configuration with $j$ bb-pairings, the number $N_w(I, S, j)$ of all possible ways to
select the remaining clean mobiles for pairing is:

$$
N_w(I,S,j) = \begin{cases}
|P(S)| & \text{in the † case} \\
|D(S, I-2j)| & \text{otherwise .}
\end{cases}
$$

Apart from the † case, when there are $j$ bb-pairings, $I - 2j$ infected mobiles remain to be connected
with $I - 2j$ clean devices. This is equivalent to computing the number of possible sets of $I - 2j$ elements
from an initial set of $S$ clean mobiles: since here the ordering matters, this is the definition of dispositions
(see Fig. 4) of $I - 2j$ elements from an original set of $S$. □

Note that, since in the † case all the clean mobiles are selected, the two quantities $N_w(I, S, j)$ and
$N(I, S, j, w_t)$ coincide.
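The normalization stated before Lemma 6 can also be checked numerically. The following sketch sums $P(I,S,j) \cdot N(I,S,j) \cdot N_w(I,S,j)$ over $j$ for configurations in which the † case cannot occur ($I \leq S$, or $I + S$ even); as before, the function name and the restriction are assumptions of this sketch.

```python
from fractions import Fraction
from math import comb, factorial, perm


def total_wiring_probability(I, S):
    """Sum over j of P(I,S,j) * N(I,S,j) * N_w(I,S,j), without the dagger case."""
    if I > S and (I + S) % 2 == 1:
        raise ValueError("dagger case possible: not covered by this sketch")
    lower = 0 if I <= S else (I - S) // 2
    total = Fraction(0)
    for j in range(lower, I // 2 + 1):
        prob = Fraction(1)                     # Lemma 2 with z = 0
        for k in range(I - j):
            prob /= I + S - 1 - 2 * k
        ways = 1                               # Lemma 4, non-dagger branch
        for k in range(1, j + 1):
            ways *= comb(I - 2 * (k - 1), 2)
        ways //= factorial(j)
        clean_choices = perm(S, I - 2 * j)     # |D(S, I - 2j)|, Lemma 6
        total += prob * ways * clean_choices
    return total
```

In the configuration $(2, 2)$, for instance, the two terms are $\frac{1}{3} \cdot 1 \cdot 2$ for $j = 0$ and $\frac{1}{3} \cdot 1 \cdot 1$ for $j = 1$, which add up to one.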



Analytical and computational notes 

Although defined only for positive integer values of $I$ and $S$, it is possible to provide a graphical sketch
of the shape of the function $P(I, S)$ by linear interpolation on the non-integer real values. In Fig. 5 we
show both the three-dimensional surface of $P(I, S)$ and its corresponding contour plot for values of $I$ and $S$
ranging between 1 and 100.

Asymptotically, the function $P(I, S)$ converges to the following limits:

$$
\lim_{I \to \infty} P(I,S) = 1 , \qquad \lim_{S \to \infty} P(I,S) = 0 , \qquad \lim_{I = S \to \infty} P(I,S) = \frac{1}{2} . \tag{4}
$$

Graphical examples of the behaviour stated in Eq. (4) are provided in Fig. 6, where a few curves of $P(I, S)$
are plotted when one of the two parameters is kept constant (and equal to 10, 50, 100) and the other
ranges between 0 and 100, together with the curve corresponding to $P(I, S)$ for $1 \leq I = S \leq 100$. When
one of the two parameters is equal to a constant $T$, the smaller $T$ is, the faster $P(I, S)$ converges to the
limits in Eq. (4).

Apart from its intrinsic theoretical relevance, the non-recursive closed formula is essential for
numerically computing $P(I, S)$. In fact, the computational cost differs notably between the recursive
formula Eq. (1) and its closed-form counterpart Eq. (2): the explicit formula is much faster, as shown
by the values reported in Tab. 1 and the curves plotted in Fig. 7.

For the recursive formula the computing time grows exponentially for increasing
values of $I$ and $S$, while for the non-recursive formula the computing time is very small and grows
minimally for $I$ and $S$ ranging between 0 and 100. Actually, the average time over 10 runs using a
Python implementation of the non-recursive formula on a 24-core Intel Xeon E5649 CPU 2.53GHz Linux
workstation with 47 GB RAM is 11 milliseconds for $I = S = 5$ and 60 milliseconds for $I = S = 100$,



Figure 5. Three-dimensional surface (a) and corresponding level plot (b) of $P(I, S)$ for
$1 \leq I, S \leq 100$, linearly interpolated on the real non-integer values.

Figure 6. Plot of curves of $P(I, S)$ for different configurations $(I, S)$. In blue, we show three
curves of $P(I, S)$ for constant $I$ ($I = 10$ solid line, $I = 50$ dashed line and $I = 100$ dotted line) and $S$
ranging from 0 to 100. All three curves approach the asymptotic value 0 for increasing $S$, more rapidly
for smaller values of $I$. In black, we show the symmetric cases obtained keeping $S$ constant ($S = 10$
solid line, $S = 50$ dashed line and $S = 100$ dotted line) and letting $I$ range from 0 to 100. Again, all
three curves approach the asymptotic value 1 for increasing $I$, more rapidly for smaller values of $S$. The
sawtooth shape of the curve $P(I, 10)$ for $I > 30$ is due to the effect of the † case, which induces abrupt
differences in $P(I, S)$ for consecutive values of $I$ (changing from even to odd). Finally, the
dotted-dashed red line shows the curve of $P(I, S)$ for $I = S$ ranging between 0 and 100: in this case, the
curve gets very close to its asymptotic value 0.5 even with small values of $I = S$; for instance,
$P(10, 10) \approx 0.52$ and $P(25, 25) \approx 0.51$.

Figure 7. Plot of the computing times (in log scale) needed to compute $P(I, S)$ for different values
of $I = S$ as listed in Tab. 1. Error bars range between minimum and maximum, while lines connect
mean values; all values refer to 10 replicates. The solid line represents computing times obtained by using
the recursive formula Eq. (1), while the dotted line corresponds to the values produced by using the closed
formula Eq. (2).

Table 1. Computing times (in seconds) required to compute $P(I, S)$ by the recursive formula
in Eq. (1) and the equivalent closed formula in Eq. (2) for different values of the number of infected ($I$) and
susceptible ($S$) mobiles. In particular, $I = S = 5, \ldots, 100$, and only the closed formula was used for $I, S > 50$ (due
to the excessively long runtimes: e.g., computing $P(50, 50)$ by the recursive formula took more than 9
hours). Mean, maximum (Max) and minimum (Min) values for 10 replicates of each experiment are
reported. All simulations were run on a 24-core Intel Xeon E5649 CPU 2.53GHz workstation with 47
GB RAM, Linux 2.6.32 (Red Hat 4.4.6), with software written in Python 2.6.6.

         Recursive (s)                          Closed form (s)
I=S      Min          Mean         Max          Min      Mean     Max
  5          0.012        0.012        0.013    0.011    0.011    0.012
 10          0.012        0.013        0.013    0.011    0.012    0.012
 15          0.013        0.013        0.014    0.011    0.011    0.012
 20          0.031        0.031        0.032    0.011    0.011    0.012
 25          0.223        0.229        0.235    0.011    0.011    0.012
 30          2.365        2.449        2.491    0.012    0.012    0.012
 35         26.203       26.757       27.419    0.012    0.013    0.013
 40        361.621      362.351      362.894    0.014    0.014    0.014
 45       3225.718     3287.492     3333.242    0.015    0.015    0.015
 50      34336.694    34433.664    34555.204    0.016    0.015    0.016
 55              -            -            -    0.018    0.018    0.019
 60              -            -            -    0.020    0.021    0.021
 65              -            -            -    0.023    0.023    0.023
 70              -            -            -    0.026    0.027    0.027
 75              -            -            -    0.030    0.030    0.030
 80              -            -            -    0.035    0.035    0.035
 85              -            -            -    0.039    0.040    0.040
 90              -            -            -    0.046    0.046    0.046
 95              -            -            -    0.052    0.052    0.052
100              -            -            -    0.060    0.060    0.061

with very limited standard deviation. On the same hardware, a Python implementation of the recursive
formula took about 12 milliseconds for $P(5, 5)$, 2.4 seconds for $P(30, 30)$, 6 minutes for $P(40, 40)$ and
more than 9 hours for $P(50, 50)$, which was the largest tested value.
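One way to see where the exponential growth comes from is to count how many function calls a plain (memo-free) evaluation of Eq. (1) performs, and to compare that with the number of distinct configurations actually visited. The sketch below does this under the assumption that every non-trivial call always evaluates both recursive terms; the helper names are hypothetical and the counts say nothing about the absolute runtimes measured above.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def naive_calls(I, S):
    """Number of calls made by a memo-free evaluation of Eq. (1) for P(I, S)."""
    if I <= 1 or S <= 0:       # base cases P(0, S), P(1, S), P(I, 0)
        return 1
    return 1 + naive_calls(I - 1, S - 1) + naive_calls(I - 2, S)


def distinct_states(I, S):
    """Number of distinct configurations reached while expanding Eq. (1)."""
    seen = set()
    stack = [(I, S)]
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        i, s = state
        if i > 1 and s > 0:
            stack.append((i - 1, s - 1))
            stack.append((i - 2, s))
    return len(seen)
```

The call count obeys a Fibonacci-like recurrence, whereas the number of distinct configurations grows only polynomially, which suggests that caching intermediate values of $P$ would remove most of the repeated work of the recursive formula.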



Proof of equivalence in the case $I = S \in 2\mathbb{Z}$

In this Section we show the kind of arguments involved in proving the equivalence between Eq. (1) and
Eq. (2) by outlining the main steps of the proof in a simple case, i.e., when there are as many infected as clean
mobiles, and their number is even. Clearly, the general case is computationally far more complex, but
it uses the same ideas.

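Proving the equivalence between the recursive and the combinatorial formula requires substituting the
explicit expression for $P(I, S)$ of Eq. (2) in its three occurrences in Eq. (1). We are assuming $I = S = 2x \in 2\mathbb{Z}$,
thus in this case the identity we need to prove reads as follows:

$$
P(2x,2x) = \frac{1}{4x-1} + \frac{2x-1}{4x-1}\,P(2x-1,2x-1) + \frac{2x-1}{4x-1}\,P(2x-2,2x) ,
$$

or, equivalently:

$$
U(x) = (4x-1)\,P(2x,2x) - (2x-1)\left[P(2x-1,2x-1) + P(2x-2,2x)\right] = 1 . \tag{5}
$$

The expression for $P(I, S)$ becomes:

$$
\begin{aligned}
P(2x,2x) &= \sum_{j=0}^{x-1} \prod_{k=0}^{2x-j-1} \frac{1}{4x-2k-1} \cdot \frac{1}{j!} \prod_{k=1}^{j} \binom{2x-2k+2}{2} \cdot (2x-2j)!\,\binom{2x-1}{2x-2j-1} \\
&= \sum_{j=0}^{x-1} \prod_{k=0}^{2x-j-1} \frac{1}{4x-2k-1} \cdot \frac{1}{j!} \prod_{k=1}^{j} (x-k+1) \prod_{k=1}^{j} (2x-2k+1) \cdot \frac{(2x-2j)!\,(2x-1)!}{(2x-1-2j)!\,(2j)!} \\
&= \sum_{j=0}^{x-1} \frac{(2j-1)!!}{(4x-1)!!} \cdot \frac{1}{j!} \cdot \frac{x!}{(x-j)!} \cdot \frac{(2x-1)!!}{(2x-2j-1)!!} \cdot (2x-2j)\,\frac{(2x-1)!}{(2j)!} ,
\end{aligned}
$$

where the upper bound is $x - 1$ since the right-hand member vanishes for $j = x$ and the product symbols
were eliminated by using the factorial and double factorial notations:

$$
\prod_{k=a}^{b} k = \frac{b!}{(a-1)!} , \qquad
n!! = \begin{cases}
1 & \text{for } n = 0, -1 \\[4pt]
\displaystyle\prod_{k=0}^{(n-1)/2} (n-2k) = n \cdot (n-2) \cdot (n-4) \cdots 3 \cdot 1 & \text{for } n \in 2\mathbb{Z}_{>0} - 1 \\[8pt]
\displaystyle\prod_{k=0}^{n/2-1} (n-2k) = n \cdot (n-2) \cdot (n-4) \cdots 4 \cdot 2 & \text{for } n \in 2\mathbb{Z}_{>0} .
\end{cases}
$$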

Analogously, the expansions for $P(I - 1, S - 1)$ and $P(I - 2, S)$ become respectively:

$$
\begin{aligned}
P(2x-1,2x-1) &= \sum_{j=0}^{x-1} \frac{(2j-1)!!}{(4x-3)!!} \cdot \frac{1}{j!} \cdot \frac{(x-1)!}{(x-j-1)!} \cdot \frac{(2x-1)!!}{(2x-2j-1)!!} \cdot (2x-2j-1)\,\frac{(2x-2)!}{(2j)!} , \\
P(2x-2,2x) &= \sum_{j=0}^{x-1} \frac{(2j+1)!!}{(4x-3)!!} \cdot \frac{1}{j!} \cdot \frac{(x-1)!}{(x-j-1)!} \cdot \frac{(2x-3)!!}{(2x-2j-3)!!} \cdot (2x-2j-2)\,\frac{(2x-1)!}{(2j+2)!} .
\end{aligned}
$$

Then the left-hand member of Eq. (5) reads as follows:

$$
\begin{aligned}
U(x) = \sum_{j=0}^{x-1} \Bigg[ & \frac{(2j-1)!!\,x!\,(2x-1)!!\,(2x-2j)\,(2x-1)!\,(4x-1)}{(4x-1)!!\,j!\,(x-j)!\,(2x-2j-1)!!\,(2j)!} \\
& - \frac{(2j-1)!!\,(x-1)!\,(2x-1)!!\,(2x-2j-1)\,(2x-2)!\,(2x-1)}{(4x-3)!!\,j!\,(x-j-1)!\,(2x-2j-1)!!\,(2j)!} \\
& - \frac{(2j+1)!!\,(x-1)!\,(2x-3)!!\,(2x-2j-2)\,(2x-1)!\,(2x-1)}{(4x-3)!!\,j!\,(x-j-1)!\,(2x-2j-3)!!\,(2j+2)!} \Bigg] ,
\end{aligned}
$$

which, collecting common factors, reduces to:

$$
\begin{aligned}
U(x) &= \sum_{j=0}^{x-1} \frac{(2j-1)!!\,(x-1)!\,(2x-1)!!\,(2x-1)!}{(4x-3)!!\,j!\,(x-j-1)!\,(2x-2j-3)!!\,(2j)!} \left[ \frac{2x}{2x-2j-1} - 1 - \frac{x-j-1}{j+1} \right] \\
&= \sum_{j=0}^{x-1} \frac{(2j-1)!!\,x!\,(2x-1)!!\,(2x-1)!\,(4j-2x+3)}{(4x-3)!!\,(j+1)!\,(x-j-1)!\,(2x-2j-1)!!\,(2j)!} .
\end{aligned}
$$

Now, expanding the double factorial by the identity:

$$
(2n-1)!! = \frac{(2n)!}{2^n\,n!} ,
$$

and carrying the terms not involving $j$ outside the summation symbol, the above quantity becomes:

$$
U(x) = \frac{2^{2x-1}\,\big((2x-1)!\big)^2\,(2x)!}{\big(2(2x-1)\big)!} \sum_{j=0}^{x-1} \frac{(x-j)\,(4j-2x+3)}{2^{2j}\,(j+1)\,(j!)^2\,\big(2(x-j)\big)!} . \tag{6}
$$

Now, applying the following identity

$$
\sum_{z=0}^{\lfloor n/2 \rfloor} \frac{4z-n+2}{2^{2z+1}\,(z+1)\,(z!)^2\,(n-2z)!} = \frac{(2n)!}{2^n\,(n!)^2\,(n+1)!}
$$

to Eq. (6) with $n = 2x - 1$, we obtain that $U(x) = 1$, as claimed. □
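The last steps of the proof can be checked numerically with exact arithmetic. The sketch below evaluates the double factorial identity, both sides of the summation identity and the right-hand side of Eq. (6); all function names are illustrative assumptions, and the check covers only the small values of $n$ and $x$ that are actually evaluated.

```python
from fractions import Fraction
from math import factorial


def double_factorial(n):
    """n!! with the convention (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def identity_lhs(n):
    """Left-hand side of the summation identity applied to Eq. (6)."""
    return sum(
        Fraction(4 * z - n + 2,
                 2 ** (2 * z + 1) * (z + 1) * factorial(z) ** 2 * factorial(n - 2 * z))
        for z in range(n // 2 + 1))


def identity_rhs(n):
    """Right-hand side: (2n)! / (2^n (n!)^2 (n+1)!)."""
    return Fraction(factorial(2 * n), 2 ** n * factorial(n) ** 2 * factorial(n + 1))


def u_from_eq6(x):
    """U(x) evaluated through Eq. (6)."""
    prefactor = Fraction(2 ** (2 * x - 1) * factorial(2 * x - 1) ** 2 * factorial(2 * x),
                         factorial(2 * (2 * x - 1)))
    series = sum(
        Fraction((x - j) * (4 * j - 2 * x + 3),
                 2 ** (2 * j) * (j + 1) * factorial(j) ** 2 * factorial(2 * (x - j)))
        for j in range(x))
    return prefactor * series
```

For $x = 2$, for example, the prefactor is $48/5$ and the sum is $-1/12 + 3/16 = 5/48$, so that $U(2) = 1$.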